EXECUTIVE SUMMARY


Assessing and understanding operator workload is an important factor for consideration during the
development of new systems. It may also be important to understand fluctuations in workload within
operational systems in order to efficiently apply automated processes and provide assistance at
critical times. This paper describes how a simple workload measure obtained every 2-3 minutes during
the evaluation of a prototype command and control console can be used to develop an operator’s
workload profile as a function of other system parameters, such as track density on the tactical plot,
and task loading. If system measures can be monitored regularly, functional models of operator
workload can be derived, and workload levels can be interpolated to provide near-continuous
workload estimates every 15-30 seconds. The resulting workload profiles can be used to identify
conditions that result in potential operator overload. Profiles from several operators may be used to
study team workload distribution and to derive more efficient work allocation strategies.

Also discussed in the present work is how the simple unidimensional workload measure relates to
multidimensional measures that differentiate between mental demand, physical demand, frustration,
and other aspects of work. Multidimensional measures require more extensive reporting and are thus
not suitable for administration during system testing. A common multidimensional scale, the NASA
Task Load Index (TLX), was administered at the conclusion of each evaluation session. Regression
analysis revealed that the 90th percentile from the distribution of unidimensional workload
estimates related to the NASA-TLX dimensions of mental effort and temporal demand in a group of 20
operators.

These findings indicate that near-continuous workload profiles may be built from simple subjective
workload estimates combined with system-state information, and that the workload estimates can
be linked to specific behavioral dimensions as captured by more complex workload assessment
scales.


CONTENTS 


EXECUTIVE SUMMARY

INTRODUCTION

EXPERIMENT 1

METHODS

RESULTS AND DISCUSSION

EXPERIMENT 2

METHODS

RESULTS AND DISCUSSION

SUMMARY

REFERENCES

Figures

1. Task Manager display used in the experiments. Each icon represents a task, which was triggered
by the system.

2. Subjective workload estimation prompt that appeared on the subject’s display every 2 minutes
during the 30-min scenario.

3. Track density on the tactical plot during the 30-min scenario.

4. Mean task load (left axis, solid line) and mean subjective workload (right axis, open circles) as a
function of time (in sec) for the eight subjects. Mean task load was obtained every 15 sec. Mean
subjective estimates were forward-lagged by 60 sec.

5. Workload estimates produced every 15 sec by the general and individual regression models
developed from the 15 subjective workload estimation points.

6. Moment-to-moment estimated workload data for Subject 2 based upon the general model and an
individually tailored model. Original workload estimates provided by the subject during the scenario
are shown as open circles.

7. Task load (solid line) and selection activity (dashed line) as a function of time for one subject.
Subjective estimates of workload are presented as filled squares.

8. Regression model-generated estimated workload profile for one subject during a 40-min ADW
scenario based upon continuous track density, task load, and operator activity measures.

Tables

1. Individual models.

2. Correlation matrix of workload estimates and NASA-TLX subscale means.


INTRODUCTION 


The  confluence  of  increased  computing  power,  pressure  to  increase  productivity,  and  efforts  to 
reduce  costs  associated  with  human  oversight  within  process  control,  manufacturing,  and  command 
and  control  systems  promises  a  greater  role  for  automation  in  the  future.  Currently,  there  is  an 
emphasis  on  maintaining  or  even  reducing  manning  levels  within  new  systems.  It  is  quite  evident  that 
future  operators  will  be  required  to  supervise  automated  processes  and  work  with  automation  in  a 
manner  not  seen  previously. 

Automation has been of interest to system developers for many years, and studies have generally
shown that while system performance can be improved, performance may worsen under certain
conditions. Failure of automation occurs most readily in systems that cannot be fully automated,
and within which human operators must actively monitor and occasionally override automated
processes. Automation within dynamic settings can increase the workload of operators
because of the extensive dialog with automated processes necessary to ensure proper functioning.
Operator reliance on automation can result in complacency and a loss of situational awareness; this
is problematic within systems requiring occasional operator control. Finally, operators may
experience a loss of expertise as direct involvement in system control declines.

Previous research has established that truly adaptive systems will require information on the
human operator’s workload levels in real time (e.g., Byrne & Parasuraman, 1996). Parasuraman et al.
(1992) have proposed that a combination of three assessment domains (environmental, activity,
operator state) can provide estimates of workload with greater stability than any subset of measures.
Environment or system-state information refers to knowledge of an operator’s task loading. For
example, the number of aircraft that must be monitored by an air traffic controller may provide a
general indication of workload. Communication activity (monitored on a radio circuit) might reflect
the extent to which a set of aircraft requires attention by the controller. Psychophysiological
measures (e.g., heart rate variability, electroencephalograph [EEG] spectral measures) provide insight
regarding an operator’s psychophysiological state, which in turn may correlate with workload
(Kramer, Trejo & Humphrey, 1996; Van Orden, Jung & Makeig, 2000; Van Orden et al., 2001).

While recent studies have suggested that psychophysiological and behavioral models are useful for
determining operator state to some degree, current findings from our laboratory indicate that greater
fidelity in the estimation of workload may be achieved from more precise modeling of the operator’s
task environment. During the course of developing a prototype command and control console for air
defense warfare (ADW), Osga et al. (2001) focused on a Task-Centric Design (TCD) approach to
meet the simultaneous requirements of reduced system manning and improved mission effectiveness.
This design approach enabled moment-to-moment tracking of tasks to be performed by an operator.
TCD was born from the realization that in order to reduce workload and assist the operator, the
system must contain some knowledge of what the operator is attempting to accomplish. Hoc (2000)
explains this in terms of a common frame of reference (COFOR) between the operator and machine.
System “awareness” of task state and operator intent enables organization of task-supportive
information, and development of tools to assist the operator in a manner that is specifically task
supportive. For ADW, TCD required continuous assessment of track data to trigger tasks, the
development of information sets to support those tasks, and the preparation of task products
(outgoing reports, recommended tactical actions) for review by the operator. The goal in TCD is to
support the operator through all task phases, from initiation to transition to new tasks. A central
design feature is a task manager algorithm and display, which presents icons to the operator based
upon the system’s assessment of changes within the tactical database that require response or action
(see Osga et al., 2001).

System initiation and recording of task information proved to be highly valuable for
moment-to-moment estimation of operator workload. ADW requires that tasks be responded to rapidly
after they have been instantiated, reducing the likelihood that tasks will not be attended to for
prolonged periods. The ADW tasks, as structured and supported by information sets, are relatively
straightforward in terms of actions and generally require confirmation by the operator to issue
messages, query tracks of interest, and order air assets to inspect suspicious radar contacts.

In Experiment 1, our goal was to determine the extent to which workload could be monitored using
the frequency of tasks posted to the task manager display and the local track density of the tactical
display. It was expected that real-time monitoring of tasks as they appeared on the task manager
display would enable more precise real-time monitoring of operator workload. A simple
unidimensional workload estimation technique was employed, allowing non-obtrusive estimation by
operators throughout a 30-min air defense scenario. This unidimensional estimate was strongly
associated with tactical plot track density and the frequency of tasks identified by the system. In
Experiment 2, the relationship of the unidimensional workload measure to a summary
multidimensional measure (NASA-Task Load Index [TLX]) was examined. Methodological and
theoretical issues as they relate to the application of automation within cooperative human-machine
systems are subsequently discussed.




EXPERIMENT  1 


METHODS 

Participants:  Eight  subjects,  affiliated  with  local  commands,  volunteered  to  complete  a  30-min 
ADW  task.  All  were  generally  familiar  with  ADW  concepts  and  operator  activities. 

Apparatus  and  Procedure:  The  experiment  was  run  on  a  personal  computer  interfaced  with  two 
flat-panel  color  displays.  The  displays  were  arranged  vertically  with  a  tactical  plot  and  associated 
information  windows  regarding  tactical  vehicles  in  the  upper  display,  and  a  “task  manager”  display 
(Figure  1)  located  below.  Subjects  could  interact  with  either  display  using  touch  or  with  a  trackball. 


Figure 1. Task Manager display used in the experiments. Each icon represents a task, which
was triggered by the system.

During a 30-min air defense scenario, subjects were required to respond to system-initiated tasks
(appearing as icons in the Task Manager display) concerning reports to be generated on the
occurrence of new tracks, track identification changes, and uncorrelated electronic surveillance
measures (ESM) activity. Initiating a task would highlight the pertinent track on the tactical plot,
present the track’s summary information within summary information set windows, and produce a
product (e.g., outgoing message, tactical action to be ordered) for review. Subjects were also
required to observe the tactical plot and initiate level 1 queries (“who are you” questions) and
level 2 warnings (“turn away” statements) to unknown and suspect aircraft that crossed predetermined
standoff distances from their ship. Standoff ranges were determined by range rings surrounding the
ship and by graphics depicting the boundary between territorial and international waters. Subjects
were required to enter a simple workload estimate (7-point scale) every 2 minutes (see Figure 2).
Roscoe (1987) has successfully used a similar method of eliciting subjective workload estimates from
pilots involved with dynamic flight activity with little task interruption. His work was based upon a
scale developed by Cooper and Harper (1969) for evaluation of pilots’ perception of aircraft handling
characteristics. Denominations on these earlier 10-point scales were tied to a semantic decision tree
regarding “tolerability” and “spare capacity” of perceived workload during the task. Roscoe
concluded that the resulting output of the instrument was nonlinear with respect to task load. The
7-point scale used in the present study was anchored only by the descriptors shown in Figure 2. No
nonlinearities were observed in the present data.

The general uniformity among the tasks we studied allowed them to be considered as equivalent
units of work; there was no need to apply distinct visual, auditory, cognitive, and psychomotor
workload estimates (see McCracken and Aldrich, 1984) to each task in order to conduct workload
modeling studies. The query and warning tasks were initiated by the subjects and scored as three
units of task work given the degree of track monitoring necessary to complete these tasks.
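The scoring scheme described above can be illustrated with a minimal sketch. It assumes that each system-initiated task counts as one unit of work (the text treats them as equivalent units without stating the unit value) and that task load at a sample time is the sum of units across the tasks then pending. The task-type labels and the function name are hypothetical and do not come from the prototype console.

```python
# Unit weights: system-initiated tasks are assumed to count as one unit each;
# operator-initiated queries and warnings count as three units (from the text).
# The task-type labels are hypothetical names used only for this sketch.
TASK_UNITS = {
    "new_track_report": 1,
    "id_change_report": 1,
    "esm_report": 1,
    "level1_query": 3,
    "level2_warning": 3,
}


def task_load(pending_tasks):
    """Return the total units of task work for the tasks pending at one sample time."""
    return sum(TASK_UNITS[task] for task in pending_tasks)


sample = ["new_track_report", "esm_report", "level1_query"]
print(task_load(sample))  # 1 + 1 + 3 = 5
```

Keeping the weights in a single lookup table makes it simple to try alternative weightings without changing the scoring logic.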


Figure  2.  Subjective  workload  estimation  prompt  that  appeared  on  the  subject’s  display  every  2 
minutes  during  the  30-min  scenario. 


RESULTS  AND  DISCUSSION 

Figure 3 presents the target density data for the 30-min period of the scenario. These data are
equivalent for each subject. Figure 4 presents the mean task loading data, obtained every 15 sec, for
the eight subjects. Also plotted are the mean subjective workload estimates. While generally similar
between subjects, task load data could vary as a function of how rapidly operators completed tasks.

A regression analysis was conducted using the target density and mean task load data to predict
mean workload estimation data. This analysis indicated that for all subjects, mean task load and track
density accounted for a significant portion of the variance (57 percent) in estimated workload,
F(2,12) = 8.12, p < .01, R = 0.76. The general model was:

Wkld  =  0.45  *  Task  Load  +  0.05  *  Target  Density  +  1.32 

For the general model, task load (TL) accounted for 39 percent of the variance, while target
density (TD) accounted for 19 percent.
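As a worked illustration, the general model can be evaluated directly for a given system state. The coefficients below are those reported in the text; the function name and the input values (4 task units, 30 tracks) are hypothetical and chosen only to show the arithmetic.

```python
def general_workload(task_load, target_density):
    """General model from the text: Wkld = 0.45 * TL + 0.05 * TD + 1.32."""
    return 0.45 * task_load + 0.05 * target_density + 1.32


# Hypothetical system state: 4 units of task load, 30 tracks on the tactical plot.
est = general_workload(task_load=4, target_density=30)
print(round(est, 2))  # 1.80 + 1.50 + 1.32 = 4.62
```

The output is on the 7-point subjective scale. The sketch does not clamp values to the 1-7 range, so extreme inputs can produce estimates outside the scale.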

Individual models were constructed for each subject; their parameters are presented in Table 1.
These models accounted for a statistically significant portion of estimated workload variance for six
of eight subjects, and demonstrated considerable variability in component weighting factors. The
individual and general models, each based on 15 subjective estimates, could then be used to
interpolate workload at 15-sec intervals for the duration of the scenario, as would be desired in a
functional real-time system.
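The fit-then-interpolate procedure can be sketched as follows. This is an illustration under stated assumptions, not the analysis code used in the study: the regression is solved as ordinary least squares through the normal equations, and every data value shown is invented. The function and variable names are hypothetical.

```python
def fit_two_predictor_model(tl, td, y):
    """Ordinary least squares for y ~ b_tl*tl + b_td*td + b0 via the normal equations."""
    rows = [(a, b, 1.0) for a, b in zip(tl, td)]
    # Augmented matrix [X'X | X'y] for the three unknowns (b_tl, b_td, b0)
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(3):
        p = max(range(c, 3), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(3):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [x - f * z for x, z in zip(aug[r], aug[c])]
    return tuple(aug[i][3] / aug[i][i] for i in range(3))


# Invented values at the 15 probe times (one probe every 2 min)
tl_probe = [0, 1, 2, 3, 2, 4, 5, 6, 5, 4, 3, 2, 3, 1, 0]
td_probe = [10, 15, 20, 25, 30, 35, 40, 45, 50, 45, 40, 35, 30, 20, 15]
ratings = [2, 2, 3, 4, 4, 5, 6, 6, 6, 5, 5, 4, 4, 3, 2]
b_tl, b_td, b0 = fit_two_predictor_model(tl_probe, td_probe, ratings)

# Evaluate the fitted model on the denser 15-sec system streams (first 2 min shown)
tl_stream = [0, 0, 1, 1, 2, 3, 3, 2]
td_stream = [10, 10, 12, 12, 15, 15, 18, 18]
profile = [b_tl * a + b_td * b + b0 for a, b in zip(tl_stream, td_stream)]
print([round(w, 2) for w in profile])
```

Because the system measures are available every 15 sec while ratings arrive only every 2 min, the model supplies estimates at the sample times that fall between subjective probes.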


Table 1. Individual models.

Subject    TL        TD        Intercept    R
S1         .29       .09*      .40          .76*
S2         .21       .12*      -.94         .71*
S3         .82       .08       .66          .68*
S4         .42       -.14      7.78         .60
S5         .58*      .03       1.26         .73*
S6         -.09      .16*      .12          .78*
S7         -.02      .08*      -.06         .57
S8         1.18*     .003      2.06         .69*
GenMod     .45*      .05*      1.32         .76*

* indicates significance at p < 0.05


Figure 5 presents moment-to-moment workload data derived from the general and individual
models for Subject 1. As shown, the general and individual models produced similar workload profiles
for this subject. Figure 6 presents similar output data, along with the original subjective workload
estimates, for Subject 2. The differences between the workload profiles produced by the general and
individual models are considerable for this subject and raise some important questions. For
example, is this subject truly different from the group, and does the subject have the excess workload
capacity indicated by the individual model? Or is his estimation scale biased towards using low
numbers compared to other subjects? In the latter case, the excess capacity for additional work might
evaporate rapidly when this subject is challenged with additional tasks. Roscoe (1987) noted that
individual variability in subjective ratings was common while using a similar instrument. The key to
addressing this issue is repeated testing under conditions that drive the subjects into an overload
state. Unfortunately, limited test scenario resources prevented repeated testing in this experiment.
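To make the divergence between the two models concrete, the sketch below evaluates the general model and the Subject 2 model on the same system state. The coefficients are copied from Table 1; the input state (4 task units, 20 tracks) is hypothetical, and the result is not intended to reproduce any point in Figure 6.

```python
# Coefficients (TL, TD, intercept) copied from Table 1.
MODELS = {
    "general": (0.45, 0.05, 1.32),
    "S2": (0.21, 0.12, -0.94),
}


def predict(model, task_load, target_density):
    """Evaluate one of the Table 1 regression models for a given system state."""
    b_tl, b_td, b0 = MODELS[model]
    return b_tl * task_load + b_td * target_density + b0


# Hypothetical system state: 4 units of task load, 20 tracks on the plot.
gen = predict("general", 4, 20)
ind = predict("S2", 4, 20)
print(round(gen, 2), round(ind, 2))  # 4.12 2.3
```

For this input the individual model yields a rating almost two scale points lower than the general model. Because the Subject 2 model weights track density more heavily, the gap narrows and can reverse at higher densities.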
[Figure 5 graphic not reproduced. Horizontal axis: Time (min).]


Figure  5.  Workload  estimates  produced  every  15  sec  by  the  general  and  individual 
regression  models  developed  from  the  15  subjective  workload  estimation  points. 



Figure  6.  Moment-to-moment  estimated  workload  data  for  Subject  2  based  upon 
the  general  model  and  an  individually  tailored  model.  Original  workload  estimates 
provided  by  the  subject  during  the  scenario  are  shown  as  open  circles. 
EXPERIMENT 2


The moment-to-moment workload measure used in the present series of studies was useful in
identifying transient workload peaks and for modeling of workload as it related to task-based
environmental measures. Unidimensional workload measures have been criticized for lacking
information regarding what specific aspect of work (e.g., physical work, mental effort, temporal
demand) is most affected by a given system and situation, prompting wide use of multidimensional
workload scales such as NASA-TLX (Hart and Staveland, 1988). Because there are many situations in
which multidimensional measures cannot be administered during an ongoing scenario, it is useful to
understand how the local-estimate unidimensional and summary multidimensional measures relate to
each other.

During the course of usability testing of our prototype command and control console,
moment-to-moment and summary NASA-TLX data were obtained from 20 subjects. Time constraints did not
permit the establishment of the relative TLX subscale weighting factors as originally prescribed by
Hart and Staveland (used for establishing a single weighted score from pairwise comparisons of the
six subscales by every subject). However, it is common practice to use a simple average of the
subscale measures, as at least one study (Nygren, 1991) has found no advantage to using the weighted
TLX score over a simple average. Experiment 2 examined the relationship between the
unidimensional and multidimensional NASA-TLX measures.

METHODS 

Participants & Procedure: Twenty subjects voluntarily participated. They completed between 30
and 40 min of the ADW task described previously. Between 15 and 20 unidimensional workload
estimation scores were obtained during the scenario, as well as the NASA-TLX measures at the
conclusion of the session.

RESULTS  AND  DISCUSSION 

Across all subjects, the correlation between the mean TLX score and the mean moment-to-moment
workload estimation measure was weak and not statistically significant (r = 0.18, p > 0.05).
Inspection of the TLX subscale data and the real-time estimates prompted a more thorough examination
of their relationship. Table 2 presents the correlation matrix for the 90th percentile of the
moment-to-moment measures and the TLX subscale measures across the subject sample. There was
considerable variability in the correlations between the TLX subscales and the real-time workload
estimation measures. The 90th percentile of the unidimensional measures was calculated because of
the suspicion that the NASA-TLX measures would reflect the highest workload levels experienced
during the session. The maximum and the mean of the unidimensional measures were found to be far
less indicative than the 90th percentile measure of changes in the TLX subscale mean data.
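The three session summaries compared above (mean, maximum, and 90th percentile) can be computed as in the following sketch. The ratings are invented for illustration, and the percentile uses linear interpolation between ranked values, which is an assumption: the text does not state which percentile definition was applied.

```python
import statistics

# Invented 7-point ratings from one session (15 probes)
ratings = [2, 2, 3, 3, 4, 4, 5, 6, 6, 5, 4, 3, 3, 2, 2]

mean_rating = statistics.mean(ratings)
max_rating = max(ratings)
# Ninth of the nine decile cut points = 90th percentile (linear interpolation)
p90 = statistics.quantiles(ratings, n=10, method="inclusive")[8]

print(mean_rating, max_rating, p90)  # 3.6 6 5.6
```

The 90th percentile sits near the top of the distribution without resting on a single extreme rating, which is one plausible reason it tracked the end-of-session TLX data better than the maximum.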
Table 2. Correlation matrix of workload estimates and NASA-TLX subscale means.

           90th Perc    EFFORT   PERFORM   FRUSTRA    TEMDEM    MENDEM   PHYSDEM
90th Perc       1.00      0.58     -0.27     -0.02      0.49      0.45      0.17
EFFORT                    1.00     -0.17     -0.12      0.39      0.46      0.71
PERFORM                             1.00     -0.08     -0.20     -0.18     -0.20
FRUSTRA                                       1.00      0.21      0.18     -0.13
TEMDEM                                                  1.00      0.32      0.39
MENDEM                                                            1.00      0.27
PHYSDEM                                                                     1.00

PERFORM, performance; FRUSTRA, frustration; TEMDEM, temporal demand; MENDEM,
mental demand; PHYSDEM, physical demand
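Before regression, a correlation matrix such as Table 2 can be screened for predictor pairs that are strongly correlated with each other. The sketch below applies a cutoff of |r| >= 0.70 to the subscale correlations from Table 2. The cutoff is an assumption chosen for illustration and is not a value stated in the text.

```python
# Pairwise correlations among the TLX subscales, copied from Table 2
SUBSCALE_R = {
    ("EFFORT", "PERFORM"): -0.17, ("EFFORT", "FRUSTRA"): -0.12,
    ("EFFORT", "TEMDEM"): 0.39, ("EFFORT", "MENDEM"): 0.46,
    ("EFFORT", "PHYSDEM"): 0.71, ("PERFORM", "FRUSTRA"): -0.08,
    ("PERFORM", "TEMDEM"): -0.20, ("PERFORM", "MENDEM"): -0.18,
    ("PERFORM", "PHYSDEM"): -0.20, ("FRUSTRA", "TEMDEM"): 0.21,
    ("FRUSTRA", "MENDEM"): 0.18, ("FRUSTRA", "PHYSDEM"): -0.13,
    ("TEMDEM", "MENDEM"): 0.32, ("TEMDEM", "PHYSDEM"): 0.39,
    ("MENDEM", "PHYSDEM"): 0.27,
}


def collinear_pairs(correlations, cutoff=0.70):
    """Return predictor pairs whose absolute correlation meets the cutoff."""
    return [pair for pair, r in correlations.items() if abs(r) >= cutoff]


print(collinear_pairs(SUBSCALE_R))  # [('EFFORT', 'PHYSDEM')]
```

Only one pair exceeds the assumed cutoff, so dropping a single member of that pair is enough to remove the flagged redundancy among the predictors.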

The correlation between the TLX subscales of mental effort and physical demand (r = 0.71)
approached a multicollinearity condition (in which predictor variables are highly correlated with each
other) and proved troublesome during initial regression analyses. The physical demand subscale data
were excluded from further analyses, as is often necessary in such cases in order to derive robust and
generalizable models (Berry and Feldman, 1985). A forward-stepwise regression procedure was used
to estimate the 90th percentile workload estimation data from the remaining TLX subscale means for
the 20 subjects. The procedure yielded a significant model (F(2, 17) = 6.20, p < 0.01; R = 0.65)
containing the TLX subscale means of mental effort load and temporal demand load. The model
accounted for 42 percent of the variance in 90th percentile moment-to-moment workload estimation
measures. The model is expressed as follows:

WKLD90* = 0.11 * Mental Effort + 0.10 * Temporal Demand + 1.85

Essentially,  this  model  transforms  TLX  subscale  ratings  on  a  20-point  scale  into  a  unidimensional 
90th  percentile  estimate  of  workload  on  a  7-point  scale.  Mental  effort  and  temporal  demand  accounted 
for  34  and  8  percent  of  the  variance,  respectively.  Increases  in  mental  effort  as  a  function  of  overall 
workload  were  likely  due  to  decisions  regarding  which  tasks  to  ignore  versus  execute  as  task  load 
increased.  These  data  indicate  that  automation  might  be  useful  in  assisting  operators  during  peak  task 
load  periods. 
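As a worked illustration of the transformation, the model can be evaluated at the midpoint and at the top of the 20-point TLX subscales. The coefficients are those reported above; the function name and input values are hypothetical.

```python
def wkld90(mental_effort, temporal_demand):
    """Model from the text: WKLD90 = 0.11 * ME + 0.10 * TD + 1.85 (TLX inputs on a 20-point scale)."""
    return 0.11 * mental_effort + 0.10 * temporal_demand + 1.85


mid = wkld90(10, 10)
top = wkld90(20, 20)
print(round(mid, 2))  # 1.10 + 1.00 + 1.85 = 3.95
print(round(top, 2))  # 2.20 + 2.00 + 1.85 = 6.05
```

With both subscales at their maximum the predicted 90th percentile rating is about 6 on the 7-point scale, so the fitted model stays within the range of the unidimensional instrument.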

The relationship between moment-to-moment workload estimates and TLX subscale measures
described above enables some estimation of which aspects of work are changing within a situation
monitored with a unidimensional measure. The relationship also raises questions about the simple
averaging of TLX subscales to form a summary workload measure. Such an approach may have been
appropriate within Nygren’s (1991) experimental paradigm; however, data within Table 2 might
indicate that simple averaging is not appropriate within the present application. It is quite possible
that TLX subscales and weightings would vary between legacy and prototype command and control
interfaces, as they would likely differ with respect to decision support features such as task
management and other track history tools. Thus, care must be exercised when interpreting TLX
means and subscale data. Modeling to unidimensional measures, as conducted in the present study, is
necessary if moment-to-moment measures are used and different conditions are evaluated.
SUMMARY

The present findings indicate that task-centric design principles and task management
human-computer interface (HCI), in combination with other task load data (such as track density),
provide useful information that correlates strongly to operator workload. Despite concerns raised by
Tsang and Wilson (1997) over the limitations of unidimensional workload scales, the assessment
instrument used in our study was useful and non-intrusive, enabling the development of reasonable
workload models that could then be interpolated to produce moment-to-moment workload estimates.
Instructions to subjects have recently been revised (requiring immediate response to the probe when
it appears on the display), reducing the backward time lag applied to the estimates from 60 to 20
seconds. A voice-input version of the estimation scale (following an auditory or visual icon) would
enable use of fractional values and impart even less task interruption.

Recently, the task and track density data have been augmented with operator activity data. Simply
obtaining the number of items selected by the operator within overlapping 30-sec intervals provides
an indication of general operator use of the console. Task load and selection activity data from one
subject are presented in Figure 7. Also presented in Figure 7 are the subjective workload estimates
provided by the subject during the 40-min scenario. The data demonstrate some expected patterns: High
concurrent activity and task load levels were associated with higher subjective workload estimates.
Lower activity and task loading levels were associated with lower workload estimates. Of greater
interest was the observation that intermediate workload estimates were often provided when selection
activity was high and task loading was minimal. Regression modeling indicated that target density,
task loading, and selection activity accounted for 70 percent of the variance in estimated workload
for this subject, F(3,17) = 13.4, p < .001; R = 0.84. The regression analysis indicated that each of the
input variables contributed significantly to explaining variance in estimated workload.
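The selection activity measure can be sketched as a count of selection events within overlapping windows. The 30-sec window length comes from the text; the 15-sec step between window starts is an assumption (chosen to match the 15-sec sampling of the other measures), and the event times and function name are hypothetical.

```python
def selection_activity(event_times, duration, window=30.0, step=15.0):
    """Count selections in overlapping windows [t, t + window) that start every `step` sec."""
    counts = []
    t = 0.0
    while t + window <= duration:
        counts.append(sum(1 for e in event_times if t <= e < t + window))
        t += step
    return counts


# Invented selection timestamps (sec from scenario start) over a 90-sec excerpt
events = [2, 5, 18, 31, 33, 40, 58, 61, 75]
print(selection_activity(events, duration=90))  # [3, 4, 4, 2, 2]
```

With a step of half the window length, each selection contributes to two consecutive counts (apart from edge effects), which smooths the activity series relative to non-overlapping bins.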

The regression model, based upon the 20 subjective workload estimation points, was then used to
interpolate estimated workload at 15-sec intervals throughout the entire 40-minute period. Figure 8
presents the estimated workload, based upon the track density, task load, and selection activity data
streams, for the subject. The output does contain some noise that could be smoothed in a real-time
system. The results indicate that simple measures of operator activity can contribute significantly to
the estimation of workload; research continues to examine the relative contributions of various
measures to the estimation of operator workload.
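One way such noise might be smoothed in real time is a trailing (causal) moving average, which uses only the current and earlier samples. The text does not specify a smoothing method, so both the technique and the window length below are assumptions, and the input values are invented.

```python
def trailing_mean(values, n=4):
    """Causal moving average over the current sample and up to n - 1 previous samples."""
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - n + 1): i + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Invented model output at consecutive 15-sec samples
raw = [3.0, 5.0, 2.0, 6.0, 4.0, 4.0]
print(trailing_mean(raw, n=2))  # [3.0, 4.0, 3.5, 4.0, 5.0, 4.0]
```

A longer window gives a steadier profile at the cost of a slower response to genuine workload peaks, so the window length would need to be tuned against how quickly the system must react.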

The simple task-weighting scheme used in the present study could easily be expanded to account
for greater workload diversity between disparate tasks. It is conceivable that some tasks might
require distinct weighting factors, adjustable depending upon the extent of automation support and
appropriate automation dialog. A recent study by Vrendenburgh et al. (2000) used weighted tasks to
derive local estimates of anesthesiologists’ workload during actual anesthetic cases. Limited support
from automation and a wide variety of monitoring and problem-solving tasks resulted in significant
variance of task weightings in this domain.

Finally, patterns of operator activity may offer further indication of workload and operator state. In
the presence of pending tasks, repeated sampling of information regarding a particular track or
function might indicate elevated workload due to higher concern or confusion. In such cases, it might
be possible to derive some measure of work efficiency. Similarly, sampling from a variety of tracks
might indicate effort to more generally understand the tactical situation and predict future events.
Workload associated with each of these activity profiles could be determined and thus improve
overall workload estimation performance.
[Figure 7 graphic not reproduced. Panel title: Subject 21 Task Load and Activity Data. Vertical axis: Frequency (tasks or selections).]


Figure  7.  Task  load  (solid  line)  and  selection  activity  (dashed  line)  as  a  function  of  time  for 
one  subject.  Subjective  estimates  of  workload  are  presented  as  filled  squares. 


[Figure 8 graphic not reproduced. Panel title: Estimated Workload Series for Subject 21.]


Figure  8.  Regression  model-generated  estimated  workload  profile  for  one  subject  during  a  40-min 
ADW  scenario  based  upon  continuous  track  density,  task  load,  and  operator  activity  measures. 


Understanding how the interactions between the system, the operator, and the environment
contribute to elevations in operator workload allows human factors engineers to more efficiently
develop systems and apply automated processes. The approach used herein is useful during the
prototyping stage of system development, and as a real-time method of establishing workload from
multiple operators for supervisory review and for intervention by automated processes.